LQ control example

This notebook illustrates the use of the following libraries:

  • symbolic math with SymPy (http://sympy.org)
  • numerical regulator synthesis with python-control (http://python-control.sourceforge.net/manual/)
  • (and a bit of plotting with matplotlib, http://matplotlib.org/)

...applied to the example of a Linear Quadratic (LQ) control problem.

PH, December 2014

1) Compute the optimal LQ control symbolically

In [1]:
import sympy
from sympy import symbols, simplify, Eq, solve
In [2]:
sympy.init_printing()

Integrator system, in discrete time:

\[x_{k+1} = x_k + u_k\]

In [3]:
x, u = symbols('x_k u_k')
x_next = x+u
x_next
Out[3]:
$$u_{k} + x_{k}$$

Instantaneous penalty (quadratic cost):

\[ c(x,u) = x^2 + r\,u^2 \]

In [4]:
r = symbols('r')
c = x**2 + r*u**2
c
Out[4]:
$$r u_{k}^{2} + x_{k}^{2}$$

The optimal control is the solution of the Bellman equation:

\[J(x) = \min_u \left[ c(x,u) + J(f(x,u)) \right] \]

with \(J(x)\) known to be quadratic: \(J(x) = V x^2\)

In [6]:
V = symbols('V')
# left-hand side
Bell_lhs = V*x**2
# right-hand side
Bell_rhs = c+V*x_next**2
Bell_rhs
Out[6]:
$$V \left(u_{k} + x_{k}\right)^{2} + r u_{k}^{2} + x_{k}^{2}$$

Minimize the right-hand side by solving \(\frac{d}{du}(\cdot) = 0\):

In [7]:
Bell_rhs.diff(u)
Out[7]:
$$V \left(2 u_{k} + 2 x_{k}\right) + 2 r u_{k}$$
In [9]:
solve(Bell_rhs.diff(u), u)
Out[9]:
$$\begin{bmatrix}- \frac{V x_{k}}{V + r}\end{bmatrix}$$

The optimal control is a linear feedback:

\[u^* = -K^* x\]

In [10]:
u_opt = solve(Bell_rhs.diff(u), u)[0]
K_opt = -u_opt/x
K_opt
Out[10]:
$$\frac{V}{V + r}$$

But we still need to find \(V\)...

→ Substitute the optimal feedback back into the Bellman equation.

In [12]:
Bell_rhs_opt = Bell_rhs.subs(u, u_opt)
Bell_rhs_opt
Out[12]:
$$\frac{V^{2} r x_{k}^{2}}{\left(V + r\right)^{2}} + V \left(- \frac{V x_{k}}{V + r} + x_{k}\right)^{2} + x_{k}^{2}$$

Solve for \(V\): the Bellman equation reduces to finding the roots of a second-order polynomial.

In [13]:
solve(Bell_lhs-Bell_rhs_opt, V)
Out[13]:
$$\begin{bmatrix}- \frac{1}{2} \sqrt{4 r + 1} + \frac{1}{2}, & \frac{1}{2} \sqrt{4 r + 1} + \frac{1}{2}\end{bmatrix}$$

Since \(V>0\) and \(r>0\), we keep only the positive solution:

In [14]:
V_opt = solve(Bell_lhs-Bell_rhs_opt, V)[1]
V_opt
Out[14]:
$$\frac{1}{2} \sqrt{4 r + 1} + \frac{1}{2}$$
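
As a sanity check, this fixed point can also be reached by value iteration: after substituting the optimal feedback, the Bellman right-hand side simplifies to \(\left(1 + \frac{V r}{V + r}\right) x^2\), so iterating \(V \leftarrow 1 + \frac{V r}{V + r}\) should converge to \(V^*\). A minimal numerical sketch (r_num and V_num are illustrative names):

import math

r_num = 1.0   # example penalty (illustrative choice)
V_num = 0.0   # start from J(x) = 0
for _ in range(50):
    V_num = 1 + V_num * r_num / (V_num + r_num)   # Bellman update

V_num, (1 + math.sqrt(1 + 4 * r_num)) / 2   # both ~ 1.618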

Final expression of the optimal gain:

In [15]:
K_opt = simplify(K_opt.subs(V, V_opt))
K_opt
Out[15]:
$$\frac{\sqrt{4 r + 1} + 1}{2 r + \sqrt{4 r + 1} + 1}$$

(Notice: \(K^*\) is equivalent to \(\frac{1}{\sqrt{r}}\) as \(r\to \infty\); a quick check is sketched below.)
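
One way to verify this asymptote with the symbols already defined (a small sketch): if \(K^* \sim 1/\sqrt{r}\), then \(\sqrt{r}\, K^*\) must tend to 1.

sympy.limit(sympy.sqrt(r) * K_opt, r, sympy.oo)   # returns 1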

Special values of \(K^*\):

In [21]:
(K_opt.subs(r, 0),
 K_opt.subs(r, 1).evalf(3),
 K_opt.subs(r, 10).evalf(3))
Out[21]:
$$\begin{pmatrix}1, & 0.618, & 0.27\end{pmatrix}$$
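
To see what such a gain does in closed loop, here is a short simulation sketch (K_num, x_num and traj are illustrative names): with \(r=1\), the feedback \(u_k = -K^* x_k\) turns the integrator into \(x_{k+1} = (1-K^*)\,x_k\), a geometric decay with ratio \(\approx 0.382\).

K_num = float(K_opt.subs(r, 1))     # ~0.618
x_num = 1.0
traj = [x_num]
for _ in range(10):
    x_num = x_num - K_num * x_num   # x_{k+1} = x_k + u_k with u_k = -K x_k
    traj.append(x_num)

traj[:4]   # [1.0, ~0.382, ~0.146, ~0.056]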

2) Plot the gain \(K^*\) as a function of the control penalty \(r\)

Transform the symbolic expression into a numerical function that can be evaluated and plotted.

In [18]:
import numpy
from numpy import array, sqrt
In [19]:
K_opt_r = sympy.lambdify(r, K_opt, numpy)
In [20]:
K_opt_r(array([0, 1, 10]))
Out[20]:
array([ 1.        ,  0.61803399,  0.27015621])
In [22]:
import matplotlib.pyplot as plt
%matplotlib inline
In [23]:
r_list = numpy.logspace(-2, 3.9, 100)
plt.loglog(r_list, K_opt_r(r_list), label='Optimal gain $K^*$')
plt.loglog(r_list, 1/sqrt(r_list), 'r--', label=r'$1/\sqrt{r}$ asymptote')

plt.grid(True)
plt.xlabel('Control penalty r')
plt.ylim(top=2)
plt.legend(loc='lower left');

3) Numerical control synthesis

Using python-control to compute the optimal feedback numerically (http://python-control.sourceforge.net/manual/synthesis.html).

(Based on the SLICOT Fortran routines: http://slicot.org/)

→ This requires the control and slycot packages (installing slycot requires a Fortran compiler, such as gfortran on Linux, and lapack-dev for the LAPACK headers).

In [24]:
import control

Linear quadratic regulator design with lqr (http://python-control.sourceforge.net/manual/synthesis.html#control.lqr)

Problem: lqr only handles continuous time; discrete time is not yet implemented (→ student project? A possible starting point is sketched below).
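
One possible sketch of such a discrete-time LQR, built on SciPy's Riccati solver rather than on python-control (the dlqr name and the use of scipy.linalg.solve_discrete_are are assumptions for illustration, not part of python-control's API):

import numpy as np
from scipy.linalg import solve_discrete_are

def dlqr(A, B, Q, R):
    """Discrete-time LQR sketch: u = -K x minimizing the sum of x'Qx + u'Ru."""
    A, B, Q, R = (np.atleast_2d(M) for M in (A, B, Q, R))
    P = solve_discrete_are(A, B, Q, R)            # discrete-time algebraic Riccati eq.
    K = np.linalg.solve(R + B.T.dot(P).dot(B),    # K = (R + B'PB)^-1 B'PA
                        B.T.dot(P).dot(A))
    return K, P

dlqr(1., 1., 1., 1.)[0]   # integrator with r=1: gain ~0.618, matching Out[21] above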

In [27]:
control.lqr??

Integrator, in continuous time:

\[\dot{x} = u\]

In [28]:
A = 0
B = 1

Q = 1
R = 100
In [29]:
K, S, E = control.lqr(A, B, Q, R)
In [30]:
K
Out[30]:
array([[ 0.1]])
In []:
1/numpy.sqrt(R)

Recompute the optimal gain symbolically

In continuous time, the optimal control is the solution of the Hamilton-Jacobi-Bellman (HJB) equation

\[0 = \min_u(c(x,u) + \frac{\partial J}{\partial x} f(x,u))\]

(in steady state)
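
Before deriving it symbolically, the same answer can be checked numerically via the continuous-time algebraic Riccati equation \(A^T P + P A - P B R^{-1} B^T P + Q = 0\), which here reduces to \(1 - P^2/r = 0\). A sketch using SciPy (an assumption; this bypasses slycot):

import numpy as np
from scipy.linalg import solve_continuous_are

r_num = 100.0
P = solve_continuous_are(np.atleast_2d(0.), np.atleast_2d(1.),
                         np.atleast_2d(1.), np.atleast_2d(r_num))
P[0, 0] / r_num   # K = R^-1 B'P = 1/sqrt(r) = 0.1, as control.lqr found above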

In [31]:
x, u = symbols('x u')
x_der = u
x_der
Out[31]:
$$u$$
In [32]:
r = symbols('r')
c = x**2 + r*u**2
c
Out[32]:
$$r u^{2} + x^{2}$$
In [33]:
V = symbols('V')
J = V*x**2

HJB = c + J.diff(x)*x_der
HJB
Out[33]:
$$2 V u x + r u^{2} + x^{2}$$

Minimize with respect to \(u\):

In [34]:
solve(HJB.diff(u), u)
Out[34]:
$$\begin{bmatrix}- \frac{V x}{r}\end{bmatrix}$$
In [35]:
u_opt = solve(HJB.diff(u), u)[0]
K_opt = -u_opt/x
K_opt
Out[35]:
$$\frac{V}{r}$$

Solve for \(V\):

In [36]:
solve(HJB.subs(u, u_opt), V)
Out[36]:
$$\begin{bmatrix}- \sqrt{r}, & \sqrt{r}\end{bmatrix}$$
In [37]:
V_opt = solve(HJB.subs(u, u_opt), V)[1]
V_opt
Out[37]:
$$\sqrt{r}$$

Optimal gain:

In [38]:
K_opt = K_opt.subs(V, V_opt)
K_opt
Out[38]:
$$\frac{1}{\sqrt{r}}$$

Observation: the optimal gains in discrete time and continuous time match for large values of \(r\) (i.e. for systems with slow closed-loop dynamics, where one discrete step closely approximates the continuous evolution). A quick numerical check is sketched below.
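
A minimal numerical illustration of this observation (a sketch; \(V^*\) is the discrete-time value computed in part 1):

import numpy as np

for r_num in (1.0, 100.0, 1e4):
    V_num = (1 + np.sqrt(1 + 4 * r_num)) / 2   # discrete-time V*
    K_disc = V_num / (V_num + r_num)           # discrete-time gain V*/(V* + r)
    K_cont = 1 / np.sqrt(r_num)                # continuous-time gain 1/sqrt(r)
    print(r_num, K_disc, K_cont)
# r=1:   ~0.618  vs 1.0
# r=100: ~0.0951 vs 0.1
# r=1e4: ~0.00995 vs 0.01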